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ABSTRACT 

A  4D  variational  data  assimilation  system  was  developed  for  assimilating  ocean  observations  with  the  Navy 
Coastal  Ocean  Model.  It  is  described  in  this  paper,  along  with  initial  assimilation  experiments  in  Monterey 
Bay  using  synthetic  observations.  The  assimilation  system  is  tested  in  a  series  of  twin  data  experiments  to 
assess  its  ability  to  fit  assimilated  and  independent  observations  by  controlling  the  initial  conditions  and/or  the 
external  forcing  while  assimilating  surface  and/or  subsurface  observations.  In  all  strong  and  weak  constraint 
experiments,  the  minimization  of  the  cost  function  is  done  with  both  the  gradient  descent  method  (in  the 
control  space)  and  the  representer  method  (observation  space).  The  accuracy  of  the  forecasts  following  the 
analysis  and  the  relevance  of  the  retrieved  forcing  correction  in  the  case  of  weak  constraints  are  evaluated.  It 
is  shown  that  the  assimilation  system  generally  fits  the  assimilated  and  nonassimilated  observations  well  in  all 
experiments,  yielding  lower  forecast  errors. 


1.  Introduction 

This  paper  describes  the  development  of  a  four¬ 
dimensional  variational  data  assimilation  (4DVAR) 
system  based  on  the  representer  method  for  the  Navy 
Coastal  Ocean  Model  (NCOM).  NCOM  is  an  opera¬ 
tional  ocean  model  (primarily  at  the  Naval  Oceano¬ 
graphic  Office)  that  has  been  validated  (Martin  2000; 
Barron  et  al.  2006),  with  several  references  in  the  liter¬ 
ature.  It  is  a  free-surface  general  circulation  model 
(GCM)  based  on  the  primitive  equations  and  employs 
the  hydrostatic,  Boussinesq,  and  incompressible  ap¬ 
proximations.  It  has  been  used  in  global-  and  basin-scale 
circulation  applications  (Barron  et  al.  2003,  2004;  Kara 
et  al.  2006);  in  coastal  applications,  for  example,  to 
model  the  upwelling  and  relaxation  events  in  Monterey 
Bay  (Shulman  et  al.  2007);  and  river  discharge  (Morey 
et  al.  2003)  and  river  plume  modeling  (Liu  et  al.  2009); 
and  in  modeling  air-sea  interactions  through  coupling 
with  atmospheric  models  (Pullen  et  al.  2006,  2007). 
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Other  applications  include  particle  transport  (Haza 
et  al.  2007;  Schroeder  et  al.  2011),  the  tracking  of  oil 
spills  in  Australia  (Brushett  et  al.  2011)  and  in  the  Gulf 
of  Mexico  (Cheng  et  al.  2011),  the  tracking  of  eddies  and 
filaments  (Burrage  et  al.  2009),  the  study  of  bottom 
Ekman  tidal  flows  (Book  et  al.  2009),  and  the  coupling  to 
ecosystem  (Jolliff  et  al.  2009)  and  biochemical  and  bio- 
optical  models  (Shulman  et  al.  2011). 

Other  models  of  the  complexity  of  NCOM  have  seen 
similar  4DVAR  development  efforts  undertaken  in  the 
past  decade.  A  4DVAR  assimilation  system  was  devel¬ 
oped  for  the  Ocean  Parallelise  (OP A)  model  (Weaver 
et  al.  2003),  for  the  Massachusetts  Institute  of  Technology’s 
(MIT)  general  circulation  model  (MITgcm;  Marotzke 
et  al.  1999)  also  used  in  the  Estimation  of  the  Circulation 
and  Climate  of  the  Ocean  (ECCO)  consortium  assimila¬ 
tion  experiments  (Stammer  et  al.  2002),  and  a  similar  sys¬ 
tem  was  built  for  the  Regional  Ocean  Model  System 
(ROMS;  Moore  et  al.  2004).  Unlike  the  other  models  using 
fixed  z  levels  (OPA  and  MITgcm)  or  s  coordinates 
(ROMS),  NCOM  uses  a  hybrid  vertical  coordinate  that 
combines  dynamic  sigma  layers  in  the  upper  ocean  and 
fixed  z  levels  below,  or  a  generalized  vertical  coordinate 
with  dynamic  sigma  layers,  static  sigma  layers,  and  z  levels. 
Dynamic  sigma  layers  evolve  with  the  free  surface  and 
static  sigma  layers  are  prescribed  a  fixed-depth  fraction 
that  is  kept  constant  throughout  a  given  model  simulation. 
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It  is  a  common  practice  to  test  a  recently  developed 
assimilation  system  with  identical  twin  experiments  in 
which  the  observations  are  simulated  by  the  numerical 
model.  Twin  experiments  allow  us  to  test  not  only  the 
system’s  ability  to  fit  the  assimilated  observations,  but 
also  to  test  the  accuracy  of  the  analysis  (and  its  associ¬ 
ated  forecast)  away  from  the  observation  locations, 
the  adequacy  of  the  prescribed  error  covariances,  and 
the  accuracy  and  relevance  of  the  retrieved  corrections 
of  the  initial  conditions,  and  the  external  forcing  in  the 
case  of  weak  constraint. 

All  of  these  consistency  and  accuracy  tests  are  carried 
out  in  Monterey  Bay.  The  region  is  chosen  for  the  nu¬ 
merous  field  experiments  that  have  been  conducted  there 
along  with  the  array  of  moored  buoys  that  have  collected 
data  over  a  long  period  of  time  The  region  provides  an 
abundant  dataset  that  will  be  used  for  evaluating  and 
validating  the  assimilation  system  with  real  observations 
in  the  second  part  of  this  study,  this  first  part  consisting 
of  the  system  description  and  twin  experiments. 

The  4DVAR  system  is  implemented  with  the  flexi¬ 
bility  of  running  either  strong  or  weak  constraint  as¬ 
similation  experiments.  The  strong  constraint  uses  the 
gradient  descent  method  for  minimizing  the  cost  func¬ 
tion.  The  weak  constraint  is  based  on  the  representer 
method  (Bennett  1992, 2002)  in  which  the  solution  to  the 
assimilation  problem  is  sought  as  the  sum  of  a  first  guess 
and  a  finite  linear  combination  of  representer  functions, 
one  per  datum.  The  computation  and  storage  of  all  the 
representer  functions  is  avoided  by  using  the  indirect 
representer  method  (Amodei  1995;  Egbert  et  al.  1994; 
Bennett  et  al.  1996;  Ngodock  et  al.  2000).  The  formu¬ 
lation  of  the  assimilation  problem  allows  for  the  in¬ 
clusion  of  a  model  error  term  (i.e.,  weak  constraint)  at 
no  extra  computational  cost  to  the  minimization  pro¬ 
cess.  In  this  paper,  an  assessment  will  be  performed  of 
the  system’s  ability  to  fit  both  assimilated  observations, 
as  well  as  the  consistency  of  the  retrieved  model  forcing. 

There  are  no  specific  applications  of  4DVAR  in 
Monterey  Bay,  let  alone  its  weak  constraint  formulation. 
Strong  constraint  variational  assimilation  (Broquet  et  al. 
2009)  has  been  applied  to  the  California  Current  System 
CCS),  including  an  application  to  estimate  surface 
forcing  correction  (Broquet  et  al.  2011),  using  the  in¬ 
verse  Regional  Ocean  Modeling  System  (IROMS;  Di 
Lorenzo  et  al.  2007)  with  horizontal  resolutions  of  10 
and  30  km.  The  CCS  is  a  large  area  that  includes  Mon¬ 
terey  Bay,  although  these  applications  did  not  specifically 
target  the  bay.  Most  of  the  assimilation  experiments  that 
have  been  carried  for  Monterey  Bay  were  based  on  se¬ 
quential  methods  such  as  3D  VAR  and  ensemble-based 
Kalman  filters  (Chao  et  al.  2009;  Haley  et  al.  2009; 
Shulman  et  al.  2009). 


A  brief  description  of  the  numerical  model  is  pre¬ 
sented  in  the  next  section,  followed  by  the  4DVAR 
system  derivation  and  implementation  in  section  3. 
Section  4  deals  with  the  experiments’  setup  and  results, 
and  concluding  remarks  follow  in  section  5. 

2.  The  model 

NCOM  is  described  in  the  literature  (Martin  2000; 
Barron  et  al.  2006).  The  model  domain  used  for  this 
experiment  contains  the  Monterey  Bay,  California,  re¬ 
gion.  This  location  is  favorable  for  ocean  modeling  due 
to  its  strong  land-sea-breeze  circulation  patterns,  com¬ 
plex  coastline  with  steep  topography,  and  the  existence 
of  frequent  local  upwelling  and  relaxation  events 
(Shulman  et  al.  2002).  The  domain  covers  35.6°-37.49°N 
and  121.38°-123.2°W  with  a  horizontal  resolution  of 
2  km  and  50  layers  in  the  vertical.  The  initial  conditions 
were  obtained  from  downscaling  the  operational  Vs0 
resolution  global  NCOM  (GNCOM)  to  an  intermediate 
model  with  horizontal  resolution  of  6  km,  and  then  via 
a  3-to-l  nesting  ratio  to  the  2-km  model.  Horizontal 
viscosities  and  diffusivities  are  computed  using  either 
the  grid-cell  Reynolds  number  (Re)  or  the  Smagorinsky 
schemes,  both  of  which  tend  to  decrease  as  resolution 
is  increased.  The  grid-cell  Re  scheme  sets  the  mixing 
coefficient  K  to  maintain  a  grid-cell  Re  number  below 
a  specified  value  (e.g.,  if  Re  =  u  X  A xlK  =  30,  then  K  = 
u  X  Ax/30).  Hence,  as  Ax  decreases,  K  decreases  pro¬ 
portionally.  A  similar  computation  is  performed  for  the 
Smagorinsky  scheme. 

Surface  boundary  conditions  (e.g.,  wind  stress,  IR 
radiation  flux,  etc.)  are  provided  by  the  Coupled  Ocean- 
Atmosphere  Mesoscale  Prediction  System  (CO AMPS; 
Hodur  1997),  a  regional  relocatable  atmospheric  model, 
which  in  turns  gets  boundary  conditions  from  the  global 
Navy  Operational  Global  Atmospheric  Prediction  Sys¬ 
tem  (NOGAPS;  Goerss  and  Phoebus  1992;  Rosmond 
1992).  The  regional  atmospheric  model  can  be  run  at  a 
user-defined  horizontal  resolution  through  a  sequence 
of  nests.  Here,  the  CO  AMPS  resolution  is  set  to  match 
the  ocean  model  resolution  of  2  km,  and  the  forcing 
fields  are  archived  and  used  in  the  model  every  3  h.  Open 
boundary  conditions  use  a  combination  of  radiative 
models  and  prescribed  values  provided  by  the  parent 
nest.  Different  radiative  options  are  used  at  the  open 
boundaries  depending  on  the  model  state  variables: 
a  modified  Orlanski  radiative  model  is  used  for  the 
tracer  fields  (temperature  and  salinity),  an  advective 
model  for  the  zonal  velocity  u ,  a  zero  gradient  condi¬ 
tion  for  the  meridional  velocity  v  as  well  as  the  baro- 
tropic  velocities,  and  the  Flather  boundary  conditions 
for  elevation. 
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3.  The  4DVAR  system 

a.  The  cost  function 

For  the  sake  of  clarity,  the  model  equations  given  in 
the  appendix  in  discretized  form  are  rewritten  in  a  sim¬ 
pler  form  here: 

f  n  V 

—  =  F(X)  +  f,  0  <t<T 

^  ,  (1) 

X(t  =  0)  =  10)  +  iO) 

where  X  stands  for  all  the  dependent  model  state 
variables — two-dimensional  sea  surface  height  (SSH) 
and  barotropic  velocities,  and  three-dimensional  tem¬ 
perature,  salinity,  and  baroclinic  velocities;  F  is  the 
model  tendency  terms  on  the  right-hand  sides  of  (Al)- 
(A5)  and  (A10)-(A12)  (see  the  appendix);  f  is  the  model 
error,  a  function  of  the  independent  variables  (x  and  t ) 
of  the  space-time  domain  fl  with  covariance  Cf. ;  I(x)  is 
the  prior  initial  conditions;  and  i(x)  is  the  initial  condition 
error  with  covariance  Q.  Given  a  vector  Y  of  M  obser¬ 
vations  of  the  model  state  in  the  space-time  domain,  with 
the  associated  vector  of  observation  errors  s  (with  co- 
variance  Cg), 

Y  =  H  X  +  g  ,  1  <  m  <  M ,  (2) 

where  Hm  is  the  observation  operator  associated  with 
the  rath  observation,  one  can  define  a  weighted  cost 
function: 


J  = 


j  j  |  [  f(x,  £)Wf(x,  t,x',  t')i(x',  t')  dx'  dt'  dxdt 

Jo  Jo  Jo  Jo 


+  [  [  i(x)Wi(x,x/)i(x/)  dx'  dx  +  £TWg£,  (3) 

JoJo 


where  H  denotes  the  model  domain;  the  weights  Wf 
and  Wi  and  are  defined  as  inverses  of  Cf  and  Q,  re¬ 
spectively,  in  a  convolution  sense;  and  W£  is  the  matrix 
inverse  of  C£.  The  latter  is  usually  considered  to  be 
a  diagonal  matrix,  from  the  assumption  that  observa¬ 
tion  errors  are  uncorrelated.  Boundary  condition  er¬ 
rors  are  omitted  from  (1)  and  (3)  only  for  the  sake  of 
clarity.  The  model  error  covariance  is  assumed  to  take 
the  form 


Cf(x,  t,x’,t') 


=  v(x)1/2v(x')1/2  exp 


exp 


(4) 


where  \(x)  is  the  error  variance  and  L  and  r  are  the 
length  and  time  scales,  respectively.  The  initial  error 
covariance  Ct  also  assumes  the  form  of  (4)  with  the 
exception  of  the  time  correlation  term  and  different 
(higher)  variance.  Horizontal  correlations  in  (4)  are 
obtained  by  solving  a  diffusion  equation  (Derber  and 
Rosati  1989;  Egbert  et  al.  1994;  Weaver  and  Courtier 
2001),  while  the  time  correlation  is  obtained  by  solv¬ 
ing  a  pair  of  coupled  Langevin  equations  (Chua  and 
Bennett  2001;  Bennett  2002;  Ngodock  2005).  Corre¬ 
lations  in  (4)  are  univariate  and  are  implemented 
layer  by  layer  for  each  model  state  variable.  The  cross 
correlations  are  provided  by  the  model  dynamics 
through  the  integration  of  the  adjoint  and  the  tangent 
linear  models. 

b.  The  minimization 

The  solution  of  the  assimilation  problem  [i.e.,  the 
minimization  of  the  cost  function  (3)]  is  achieved  by 
solving  the  following  Euler-Lagrange  (EL)  system: 


d-f  =  F(X)  +  Cf- A,  0 
X(f  =  0)  =  I(x)  +  C,  o  A(x,  0) 

T  MM 

A+  I  lWsmn(ym-HmX)S(x-xm)S(t-tm),  0 

m= 1 n= 1 

U(U  =  o 


dA 

~dt 


S(x) 


where  A  is  the  adjoint  variable  defined  as  the  weighted 
residual 

A  (x,t)=  [  [  W f(x,t,x' ,tf)f(x' ,tf)  dx' dt' 

Jo  Jo 


8  denotes  the  Dirac  delta  function,  W^mn  are  the  matrix 
elements  of  W£,  the  superscript  T  denotes  the  trans¬ 
position,  and  the  covariance  multiplication  with  the 
adjoint  variable  is  the  convolution 
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Cj  •  A(x,  t)  =  f  f  Cj>(x,  t ,  x',  t')X(x',  t ')  dx'  dt' 
j  o  j  n 

for  the  model  error  and 


CfoA(*,0) 


C.(x,x')X(xr  ,0)  dx' 
n 


for  the  initial  condition  error.  Note  that,  although  the 
cost  function  is  written  with  the  inverse  of  the  co- 
variance  functions,  the  actual  inverses  are  not  needed  in 
the  Euler-Lagrange  equations  in  (5)  associated  with  the 
minimization  of  (3). 


1)  Gradient  descent 

By  omitting  the  model  error  term  in  (1)  or  setting  Cfto 
zero  in  (5),  one  recovers  a  familiar  EL  for  strong  con¬ 
straint  variational  data  assimilation  that  is  solved  by  it¬ 
erative  algorithms  such  as  the  gradient  descent.  Starting 
from  a  first  guess  of  the  initial  conditions,  the  nonlinear 
model  is  integrated  forward  in  time  and  substituted  into 
the  adjoint  model  that  is  integrated  backward  in  time, 
and  from  which  the  increments  or  corrections  to  the 
initial  conditions  are  computed.  The  new  initial  condi¬ 
tions  are  used  in  a  new  iteration  of  the  algorithm  until 
the  selected  convergence  criterion  is  satisfied.  The  same 
process  can  be  used  with  the  inclusion  of  the  model  er¬ 
ror,  except  that  the  increase  of  the  control  space  also 


means  the  increase  of  the  computational  expense,  and 
a  potentially  ill-conditioned  minimization  problem. 

2)  The  representer  method 

Allowing  the  model  error  increases  the  dimension  of 
the  control  space  and  the  computational  cost  of  the  as¬ 
similation  and,  usually,  renders  the  minimization  pro¬ 
cess  poorly  conditioned.  This  difficulty  may  be  avoided 
if  the  minimization  is  done  in  the  data  space,  which  does 
not  depend  on,  and  is  usually  smaller  than,  the  control 
space.  That  is  possible  through  the  representer  algo¬ 
rithm,  which  expresses  the  solution  of  the  EL  system  as 
the  sum  of  a  first  guess  and  a  finite  linear  combination  of 
representer  functions,  one  per  datum.  This  cannot  be 
applied  to  (5)  directly  mainly  because  of  its  nonlinear 
property.  However,  following  Ngodock  et  al.  (2000)  and 
Bennett  (2002),  the  representer  algorithm  can  be  ap¬ 
plied  to  a  linearized  form  of  (5),  which  is  obtained  either 
by  linearizing  (5)  directly  or  by  linearizing  the  forward 
model  (1)  and  deriving  an  EL  associated  with  the  cost 
function  based  on  the  linearized  forward  model.  The 
iterative  process  by  which  the  solution  of  the  linearized 
EL  becomes  the  background  for  the  next  linearization 
until  formal  convergence,  is  known  as  the  “outer  loop;” 
while  the  “inner  loop”  consists  of  actually  solving  the 
linear  EL  system.  In  either  case,  given  a  background 
model  solution  X°  around  which  the  model  is  linearized, 
one  can  write  a  linear  EL  system  in  the  form 


—  =  F(Xk~1)  + 
dt  v  ’ 

—  n\  —  i 


ax^x  ' 


(X* 


rk~  1 


)  +  Cyr  •  A,  0  <  t  <  T , 


Xk{t  =  0)  =  I(x)  +  C.  °  A(x,  0) 

T  MM 


dX 

~dt 

U(E)  =  0 


(6) 


*-1'  I  X"Wym-H mXk)S(x-xJS(t-tm),  0 

m= 1 n— 1 


where  k  denotes  the  outer  loop  index  and  X^  is  the  EL  t\  =  xk(  A  +  Y  Rk  k  (  A  (1\ 

solution  after  the  kth  outer  loop.  The  EL  system  (6)  is  F  m=1 

a  linear  coupled  system  between  the  adjoint  and  state  where  X^  is  a  first-guess  solution,  fikm  are  the  coefficients, 
variables.  The  representer  method  uncouples  the  system  and  r^(x,  t),  1  <  m  <  M,  are  the  representer  functions 
by  expanding  the  solution  as  defined  by 


dt 


dF 


ax 


akm  +  HTS(x-xm)8(t-tm),  0 


<(T)=  0 


drk 

J'  m 

dt 


=  F(Xk~1) 


dF_ 

ax 


(X*-1) 


rkm  +  Cf-akm,  0 


rkm(t  =  0)  =  Ci°akm(x,0) 


(8) 
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Fig.  1.  Time  evolution  of  the  magnitude  of  the  perturbation  to 
the  temperature,  salinity,  u,  v ,  and  SSH  fields  normalized  by  the 
magnitude  of  their  respective  initial  perturbations. 

Note  that  the  equations  in  (8)  are  only  weakly  coupled, 
since  the  QLkm,  also  known  as  the  adjoint  representer 
functions,  depend  only  on  the  observation  locations  and 
can  be  computed  independently  of  the  rkm.  The  first  guess 
in  (7)  can  be  chosen  as  the  previous  outer-loop  solution 
X^-1,  or  the  tangent  linear  solution  around  X^_1.  It  may 
be  shown  that  the  representer  coefficients  are  computed 
by  solving  a  linear  system  in  data  space  involving  the 
representer  matrix,  the  data  error  covariance  matrix, 
and  the  innovation  vector: 

(R*  +  Ca)j3  =  Y-HX£,  (9) 

where  is  the  representer  matrix  with  elements 
R^nn  =  rm(xn ,  tn),  that  is,  the  rath  representer  function 
evaluated  at  the  nth  observation  space-time  location 
(xn,  tn).  The  entire  representer  matrix  need  not  be  com¬ 
puted  explicitly  since  the  linear  system  (9)  can  be  solved 
using  an  iterative  algorithm  (e.g.,  the  conjugate  gradient), 
by  taking  advantage  of  the  symmetry  of  each  matrix  in¬ 
volved.  Also,  the  representer  coefficients  constitute  the 
right-hand  side  of  the  adjoint  equation  in  the  EL  system. 
Thus,  once  the  representer  coefficients  are  computed, 
they  are  substituted  into  the  adjoint  equation,  which  is 
then  solved  and  substituted  into  the  forward  linear 
equation  for  the  final  solution.  A  background  solution 
around  which  the  model  is  linearized  is  needed.  Usually, 
it  is  the  solution  of  the  nonlinear  model.  For  the  first- 
guess  solution,  one  may  consider  either  the  background 
or  the  tangent  linear  solution  around  the  background. 
Also,  the  new  optimal  solution  may  replace  the  back¬ 
ground  for  another  minimization  process  (i.e.,  outer 
loops)  until  formal  convergence  (Bennett  et  al.  1996; 
Bennett  2002;  Ngodock  et  al.  2000,  2007,  2009). 


Fig.  2.  Vertical  distribution  of  the  magnitude  of  the  model  errors 
normalized  by  their  surface  value. 


c.  Linearization  and  adjoint  derivation 

The  generation  of  both  the  tangent  linear  and  adjoint 
codes  of  the  model  using  the  Parametric  FORTRAN 
compiler  (PFC;  Erwig  et  al.  2007),  including  the  symmetry 
test  between  the  two  codes,  was  described  in  Ngodock 
and  Carrier  (2013).  Here,  the  stability  of  the  linearized 
model  is  assessed  by  the  time  evolution  of  small  pertur¬ 
bations:  the  tangent  linear  model  is  initialized  by  three- 
dimensional  perturbations  of  the  temperature,  salinity, 
and  velocity  fields,  as  well  as  two-dimensional  perturba¬ 
tions  of  the  surface  elevation  field.  These  initial  pertur¬ 
bations  are  generated  randomly  and  assigned  a  magnitude 
of  2K  for  temperature,  0.3 psu  for  salinity,  0.09 ms-1  for 
velocity,  and  0.01m  for  surface  elevation.  At  each  time 
step  the  norms  of  the  perturbed  fields  are  computed  and 
divided  by  the  norms  of  their  respective  initial  perturba¬ 
tions.  Results  in  Fig.  1,  plotted  every  6h  and  starting  6h 
into  the  integration,  show  that  the  linear  perturbations  are 
stable  and  bounded  for  about  16  days  before  they  start  to 
grow  exponentially. 

d.  Error  standard  deviations:  v(x)1/2 

Assigning  model  errors  and  prescribing  their  co- 
variances  is  the  most  difficult  task  in  data  assimilation 
(Daley  1992;  Talagrand  1999;  Bennett  2002;  Wunsch 
2006).  Not  only  are  there  many  error  sources  (external 
forcing,  initial  and  boundary  conditions,  bad  parameter¬ 
ization,  empirical  formulation,  unresolved  processes), 
but  also  the  errors  cannot  be  measured.  Therefore,  one 
can  only  make  assumptions  about  them.  Since  NCOM 
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Fig.  3.  (a)  The  model  domain  with  bathymetry  contours  and  the  horizontal  observation  locations  (white  squares);  the  model  layer  index 

vs  observation  depths  for  (b)  the  upper  500  m  and  (c)  below  500  m. 


includes  all  resolvable  processes  and  subgrid-scale  pa¬ 
rameterization,  errors  are  attributed  to  the  initial  condi¬ 
tions  and  external  forcing  for  all  the  dynamical  equations, 
and  the  derivation  of  their  estimates  is  given  below.  Note 
that  there  is  no  external  forcing  applied  to  the  continuity 
equation,  and  thus  it  is  not  assigned  a  model  error  either, 
as  in  Jacobs  and  Ngodock  (2003). 

Consider  the  momentum  equation  in  its  non- 
discretized  form: 

—  +P  1F//z,  (10) 

where  F  represents  the  wind  stress  atmospheric  forcing 
(N  m-2),  the  volume  flux  source,  or  the  tidal  potential;  h 


is  a  typical  water  depth;  and  p  is  the  water  density.  The 
model  error  at  the  surface  consists  of  errors  in  the  wind 
stress.  For  the  subsurface,  errors  are  assumed  to  arise 
from  the  volume  flux  and  the  tidal  potential  terms.  Er¬ 
rors  are  considered  to  be  high  in  magnitude  at  the  sur¬ 
face  and  decreasing  with  depth.  Although  the  wind 
stress  varies  in  space  and  time,  its  associated  error  is 
assumed  to  be  uniform  in  the  horizontal  directions.  The 
error  magnitude  is  considered  to  be  10%  of  the  actual 
wind  stress  at  the  surface  and  decreasing  with  depth  in 
order  to  mimic  the  decreasing  impact  of  wind  stress  with 
depth.  Two  terms  contribute  to  the  forcing  for  the 
temperature  equation:  the  net  longwave,  latent,  and 
sensible  heat  fluxes  on  one  hand,  and  the  solar  radiation 


Table  1.  List  of  all  assimilation  experiments  and  their  associated  settings  indicated  by  the  checked  boxes. 


Expl 

Exp2 

Exp3 

Exp4 

Exp5 

Exp6 

Exp7 

Exp8 

Exp9 

Wrong  initial  conditions 

X 

X 

X 

X 

X 

X 

X 

X 

August  forcing 

July  forcing 

X 

X 

X 

X 

X 

X 

X 

X 

X 

30%  forcing  perturbation 

2  X  model  error  std  devs 

X 

X 

X 

X 

X 

X 

Original  simulated  data  sampling  and  error  covariances 
Original  simulated  data  sampling  (surface  only)  and 

X 

X 

X 

X 

X 

X 

X 

X 

error  covariances 

Twin  data  sampling  and  error  covariances  as  in 
real-data  experiments 

3-day  assimilation  window 

X 

X 

Strong  constraint 

Weakly  strong  constraint 

Weak  constraint 

X 

X 

X 

X 

X 

X 

X 

X 

X 
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SSH  with  surface  velocity  vectors  for  the  (a),(b)  control  and  (c),(d)  free-running  solutions  at  the  beginning 
and  end  of  the  assimilation  window. 


on  the  other  hand.  Both  are  assumed  to  be  10%  in  error 
and  the  sum  of  their  errors  constitutes  the  forcing  error 
in  the  temperature  equation,  with  a  spatial  distribution 
similar  to  the  one  used  for  the  errors  in  the  momentum 


equation.  A  similar  approach  is  taken  for  the  errors  in 
the  salinity  equation,  where  the  forcing  consists  of  the 
river  inflow  and  evaporation  minus  precipitation.  Forc¬ 
ing  terms  here  are  also  considered  to  be  10%  in  error. 
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Fig.  5.  As  in  Fig.  4,  but  for  (c),(d)  assimilated  solutions  at  the  end  of  the  first  cycle  (6  Aug)  and  end  of  the  assimilation  window  (31  Aug) 

from  Expl. 
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Fig.  6.  Time  series  of  the  fit  to  the  observations  (red  line)  for  (a)  temperature,  (b)  salinity, 

(c)  SSH,  and  (d)  velocity. 


The  distribution  of  the  model  errors  with  depth  in  the 
momentum  and  tracer  equations  follows  the  normalized 
profile  in  Fig.  2,  where  errors  at  the  surface  have  the 
10%  magnitude  of  the  forcing  described  above.  The 
spatial  and  temporal  correlation  scales  in  (4)  are  set  to 
20  km  and  30  h,  respectively. 

4.  Experiment  setup  and  results 

The  model  was  initialized  on  1  July  2003  and  ran  for  3 
months  to  30  September  2003.  A  portion  of  the  model 
trajectory  that  corresponds  to  the  time  window  1-30 
August  is  considered  the  control  solution,  and  will  also 
be  referred  to  as  the  true  solution  or  simply  the  “truth.” 
Several  1-month-long  assimilation  experiments  are 
carried  out  in  order  to  evaluate  the  ability  of  the  system 
to  correct  the  initial  conditions  alone,  the  external 
forcing  alone,  both  the  initial  conditions  and  external 
forcing,  and  the  model  error  defined  in  the  subsurface  as 
a  volume  source. 

A  1 -month  assimilation  window  is  selected  for  1-30 
August,  a  time  interval  during  which  observations  are 
sampled  from  the  control  solution.  The  simulated  ob¬ 
servations  are  saved  and  utilized  in  the  4DVAR  analysis 
daily  (i.e.,  at  intervals  of  24  h).  There  are  28  uniformly 


distributed  observation  stations  for  SSH  and  profiles  of 
velocity,  temperature,  and  salinity  across  the  model 
domain,  as  seen  in  Fig.  3.  Each  profile  is  represented  on 
the  model’s  vertical  grid  of  49  layers.  Neighboring  ob¬ 
servations  are  separated  horizontally  from  each  other  by 
30  km.  The  temperature,  salinity,  velocity,  and  SSH 
observation  errors  are  set  to  0.4°C,  0.1  psu,  0.05 ms-1, 
and  0.05  m,  respectively,  and  are  held  constant  through 
the  entire  assimilation  window.  These  observation  er¬ 
rors  are  purposefully  set  low  to  test  the  assimilation’s 
ability  to  reduce  large  discrepancies  with  the  model 
(i.e.,  to  drive  the  model  with  large  errors  to  fit  obser¬ 
vations  with  small  errors).  Within  the  month-long  win¬ 
dow,  the  assimilation  is  carried  out  in  cycles  of  5  days, 
where  the  analysis  at  the  end  of  one  cycle  becomes  the 
initial  conditions  for  the  next  cycle.  A  total  of  nine  as¬ 
similation  experiments  (summarized  in  Table  1)  are 
carried  out,  using  strong  to  weak  constraints.  They  are 
designed  to  test  several  parameters  of  the  assimilation 
system. 

a.  Controlling  only  the  initial  conditions 

The  wrong  initial  conditions  are  generated  as  follows: 
first,  the  solution  on  5  July  is  used  to  restart  the  model  on 
15  July  and,  then,  the  model  is  integrated  for  16  days 
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Fig.  7.  As  in  Fig.  6,  but  for  Exp2. 


using  the  atmospheric  forcing  corresponding  to  15-31 
July.  This  yields  a  set  of  initial  conditions  on  1  August 
that  are  completely  different  from  the  control  solution 
on  the  same  date,  and  thus  a  solution  for  1-31  August 


that  is  different  from  the  control  solution  for  the  same 
period,  as  seen  in  Fig.  4. 

Because  the  difference  between  the  two  solutions  is 
solely  due  to  errors  in  the  initial  conditions,  a  strong 
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Fig.  8.  As  in  Fig.  5,  but  for  Exp3. 
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Fig.  9.  As  in  Fig.  6,  but  for  Exp3. 


constraint  assimilation  approach  is  used  (Expl),  in 
which  the  initial  condition  errors  for  temperature,  sa¬ 
linity,  velocity,  and  SSH  are  set  to  2°C,  0.5  psu,  0.5  m  s-1, 
and  0.2  m,  respectively.  These  initial  condition  (IC)  er¬ 
rors  are  kept  the  same  for  the  first  two  assimilation  cy¬ 
cles,  then  progressively  decrease  by  20%  every  cycle 
after  the  second,  as  the  assimilation  is  expected  to  im¬ 
prove  the  initial  conditions  for  the  following  cycle.  The 
IC  errors  are  initially  large  to  reflect  the  significant  dis¬ 
crepancy  between  the  free-running  and  the  control  so¬ 
lutions  at  the  beginning  of  the  assimilation  window. 

Results  from  the  strong  constraint  assimilation  in  Fig.  5 
show  that  there  are  still  some  noticeable  differences  be¬ 
tween  the  assimilated  and  the  control  solutions  after  the 
first  cycle,  especially  in  the  SSH  and  surface  velocities. 
However,  by  the  end  of  the  assimilation  window  (i.e., 
after  the  sixth  cycle),  the  differences  between  the  two 
solutions  are  minor  and,  for  the  most  part,  within  the 
respective  observation  errors.  This  is  confirmed  by  the  fit 
to  the  observation  metric: 


1 


M 


7  FIT 


i 


ly» 


H  X" 


(11) 


where  ym  is  the  rath  observation,  M  is  the  number  of 
observations,  Hm  is  the  observation  operator,  Xa  is  the 


assimilated  solution  or  analysis,  and  <rm  is  the  observa¬ 
tion  error.  The  right-hand  side  of  (11)  is  computed  as 
a  time  series  for  each  observation  type,  and  also  evalu¬ 
ated  for  the  free-run  solution  and  the  first  guess.  The 
latter  is  the  forecast  obtained  by  integrating  the  non¬ 
linear  model  for  5  days,  initialized  by  the  analysis  at  the 
end  of  the  cycle. 

Because  the  assimilation  is  expected  to  fit  the  obser¬ 
vations  to  within  the  observation  standard  deviations, 
the  right-hand  side  of  (20)  is  expected  to  be  less  than 
one  for  the  analysis,  but  not  necessarily  for  the  first 
guess.  It  can  be  seen  in  Fig.  6  that  the  free-running  so¬ 
lution  is  far  from  the  observations,  by  as  much  as  two 
standard  deviations  in  temperature  end  velocity,  due 
especially  to  the  wrong  initial  conditions  being  chosen. 
In  general,  the  assimilation  fits  the  observations  within 
a  standard  deviation,  except  for  the  temperature  in  the 
first  cycle  and  the  velocity  in  the  first  three  cycles. 
There  is  also  a  significant  improvement  in  the  forecast, 
which  remains  within  an  observation  standard  de¬ 
viation  after  the  first  two  cycles.  After  the  first  cycle, 
forecast  (and  thus  analysis)  errors  remain  low  due  to 
the  right  forcing,  except  in  the  velocity.  The  latter 
display  some  inaccuracies  resulting  from  the  end  of 
the  first  cycle  that  negatively  affect  both  the  forecast 
and  the  analysis  in  the  second  cycle.  However,  the 
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Fig.  10.  Profile  time  series  of  (a),(c)  absolute  temperature  and  (b),(d)  salinity  differences  with  the  true  solution  from  the  (top) 

free-running  and  (bottom)  Exp3  solutions. 


assimilation  corrects  those  inaccuracies  in  the  third  and 
subsequent  cycles. 

Weakly  strong  constraint 

It  is  also  possible  to  carry  out  a  strong  constraint  ex¬ 
periment  within  the  representer  method  by  setting  the 
model  error  term  to  zero  in  the  linearized  EL.  This  ex¬ 
periment  is  referred  to  as  the  weakly  strong  constraint 
(Exp2),  because  the  correction  of  the  model  trajectory  is 
based  on  the  linearized  dynamics  [see  (6)]  contrary  to 
the  original  strong  constraint  that  uses  nonlinear  dy¬ 
namics.  The  weakly  strong  approach  has  the  potential  to 
be  computationally  less  expensive  when  the  number  of 
observation  being  assimilated  is  significantly  less  than 
the  dimension  of  the  initial  conditions.  The  fit  to  the 
observations  metric  plot  in  Fig.  7  shows  that  the  weakly 
strong  constraint  assimilation  fits  the  observations  simi¬ 
larly  well  compared  to  the  strong  constraint.  There  is  a 
faster  decrease  of  the  analysis  error  in  the  first  two  cycles, 
with  the  exception  of  salinity.  There  is  also  a  slight  in¬ 
crease  in  the  analysis  error  in  temperature  (fifth  cycle) 
and  velocity  (fifth  and  sixth  cycles).  The  differences  in  the 
curves  from  Figs.  6  and  7  are  mostly  due  to  the  fact  that 
the  correction  of  the  model  trajectory  in  the  strong  con¬ 
straint  uses  nonlinear  dynamics,  whereas  linearized  dy¬ 
namics  are  used  in  the  weakly  strong  constraint. 


b.  Weak  constraint:  Controlling  IC  and  forcing 

In  the  third  experiment  (Exp3),  the  assimilation  is 
carried  out  with  the  wrong  initial  conditions  used  in  the 
experiments  above  and  incorrect  forcing  (described 
below)  using  the  representer  method.  The  experiment 
is  set  up  by  running  the  model  in  the  second  month 
using  the  wrong  initial  conditions  for  the  strong  con¬ 
straint  experiment  above,  and  the  30-day  atmospheric 
forcing  starting  from  1  July.  This  rather  drastic  choice 
of  the  wrong  forcing  (just  as  with  the  wrong  initial 
conditions  in  the  strong  constraint  experiment)  is 
intended  to  test  the  robustness  of  the  assimilation  sys¬ 
tem.  Usually,  a  perturbation  is  added  to  the  true  forc¬ 
ing.  The  initial  condition  errors  are  the  same  as  those 
of  the  strong  constraint  experiment,  and  the  model 
errors  are  as  described  in  section  3c.  Results  of  assim¬ 
ilating  with  wrong  ICs  and  forcing  fields  show  that  for 
surface  fields  (temperature,  height,  and  velocity)  in 
Fig.  8,  the  assimilated  solution  has  substantial  inac¬ 
curacies  after  the  first  cycle,  although  the  solution  shows 
much  better  agreement  with  the  truth  at  the  end  of  the 
last  cycle. 

The  fit  to  the  observations  metric  in  Fig.  9  shows  that 
the  assimilation  is  fitting  temperature,  salinity,  and  SSH 
within  respective  standard  deviations,  and  the  fit  to  ve¬ 
locity  is  borderline  with  the  standard  deviation.  In  Fig.  8 
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Fig.  11.  Time  series  of  (a),(b)  surface  wind  stress  and  (c),(d)  heat  flux  for  the  months  of  July 
(red  lines)  and  August  (black  lines)  at  the  (left)  Mland  (right)  M2  buoys  in  Monterey  Bay.  The 
August  forcing  is  the  true  forcing. 


errors  of  about  IK  (two  standard  deviations)  still  re¬ 
main  in  the  surface  temperature,  which  seems  to  con¬ 
tradict  the  metric  in  Fig.  9.  The  latter,  however,  is  an 
average  that  does  not  reveal  the  inaccuracies  of  Fig.  9 
due  to  there  being  a  better  fit  to  the  observations  in  the 
lower  layers.  This  is  shown  in  Fig.  10,  where  it  can  be 
seen  that  both  temperature  and  salinity  errors  from  the 
free  run  extended  to  more  than  100  and  50  m,  respec¬ 
tively.  Assimilation  errors  are  significantly  reduced  in 
magnitude  and  do  not  extend  below  50  m.  That  the  re¬ 
maining  errors  in  the  assimilation  are  confined  near  the 
surface  may  be  due  to  the  drastic  choice  of  wrong  forcing 
fields  and  rather  small  standard  deviations  for  the  model 
error  terms. 

The  time  series  of  the  wind  stress  and  heat  flux  for  the 
true  and  erroneous  July  forcings  are  shown  in  Fig.  11  at 
the  locations  of  the  Ml  and  M2  buoys  in  Monterey  Bay. 
The  forcing  fields  display  large  differences  in  the  first  7 
and  the  last  11  days  of  the  assimilation  window.  In  be¬ 
tween  those  days,  the  wind  stress  generally  is  in  the  same 
direction  in  both  forcings,  but  differences  persist  in  the 
magnitude.  Expressing  the  July  forcing  in  error  per¬ 
centage  shows  that  on  average  it  is,  relative  to  the  true 
forcing,  more  than  100%  in  error  over  all  the  model 
domain. 


In  light  of  the  significantly  high  errors  in  the  July 
forcing  and  the  very  low  standard  deviations  in  the  as¬ 
similation,  this  weak  constraint  experiment  is  repeated 
in  two  ways.  First,  the  July  forcing  is  used  in  the  assim¬ 
ilation  while  the  model  error  standard  deviations  are 
increased  by  a  factor  of  2  (Exp4).  Note  that  these  stan¬ 
dard  deviations  are  still  significantly  lower  than  the  ac¬ 
tual  errors  that  the  July  forcing  brings  into  the  system. 
Second,  perturbed  forcing  fields  (instead  of  the  1 -month 
delayed  case)  representing  30%  of  the  July  forcing  are 
added  to  the  true  forcing,  and  the  assimilation  is  carried 
out  with  the  10%  errors  prescribed  earlier  (Exp5). 

Assimilation  results  from  these  experiments  are 
shown  in  comparison  to  the  original  weak  constraint 
experiment  (Exp3)  for  temperature  profiles  and  surface 
velocity  and  height.  It  is  shown  in  Fig.  12  that  when  the 
assimilation  is  carried  out  with  the  wrong  July  forcing, 
the  quality  of  the  analysis  slightly  improves  as  the  model 
error  standard  deviations  are  increased  by  a  factor  of  2, 
from  10%  to  20%.  This  result  shows  that  the  assimilation 
system  is  able  to  overcome  significantly  high  errors  in 
the  forcing  to  correct  the  model  trajectory  in  each  cycle 
with  significantly  low  model  error  specifications.  The 
results  of  the  assimilation  get  even  better  when  the 
forcing  error  is  dropped  to  30  % ,  while  keeping  the  model 
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Fig.  12.  Profile  time  series  of  temperature  differences  (a)-(c)  between  the  true  solution  and  the  free  run  and  (d)-(f)  between  the  true  and 
assimilated  solutions  from  (left)  Exp3,  (center)  Exp4,  and  (right)  Exp5.  Time  series  are  shown  for  the  last  two  cycles. 


error  to  10%.  That  is,  when  the  model  error  is  set  to 
relatively  similar  levels  as  the  actual  errors,  the  assimi¬ 
lation  fits  the  observations  with  greater  accuracy.  Similar 
results  are  obtained  with  salinity  profiles  (not  shown). 


The  improvement  of  the  assimilation  is  also  assessed 
for  the  analyzed  SSH  and  surface  velocities  by  looking  at 
the  difference  of  these  fields  between  the  truth  and  the 
analysis  from  the  three  weak  constraint  experiments.  As 
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Fig.  13.  SSH  and  surface  velocity  vector  differences  between  the  truth  and  (a)  Exp3,  (b)  Exp4,  and  (c)  Exp5,  on  21  Aug. 
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Fig.  14.  Time  series  of  the  fit  to  the  observations  metric  from  (a)  Exp3,  (b)  Exp4,  and  (c)  Exp5. 


can  be  seen  in  Fig.  13,  the  analysis  errors  of  these  surface 
fields  are  similar  between  the  original  weak  constraint 
experiment  and  the  one  with  doubled  model  error 
standard  deviations,  and  the  analysis  error  is  much  lower 
for  the  experiment  with  30%  perturbation  on  the  forcing 
and  the  model  error  standard  deviations  set  at  30%. 


The  overall  assimilation  performance  is  shown  in 
Fig.  14  as  the  fit  to  the  observations  metric.  Unlike 
similar  figures  shown  above  for  individual  observation 
types,  the  metric  in  Fig.  14  combines  all  observations.  It 
confirms  the  assessment  of  the  three  weak  constraint 
experiments  described  above  for  the  temperature 
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Fig.  15.  As  in  Fig.  6,  but  for  Exp6. 
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Fig.  16.  As  in  Fig.  5,  but  for  only  Exp6. 


profiles  and  the  surface  height  and  velocity.  1)  The 
original  weak  constraint  assimilation  experiment  with 
the  wrong  July  forcing  and  only  10%  model  error  stan¬ 
dard  deviations  is  able  to  correct  the  model  trajectory 
and  steer  it  toward  the  observations  so  that  by  the  end  of 
the  first  cycle  the  analysis  error  is  within  the  observa¬ 
tions  standard  deviation.  The  associated  forecast  is  also 
significantly  improved;  from  the  third  cycle  and  forward, 
the  forecast  error  is  about  an  observation  standard  de¬ 
viation.  2)  There  is  a  marginal  improvement  (slightly 
lower  analysis  error)  of  the  assimilation  by  increasing 
the  model  error  standard  deviations  by  a  factor  of  2,  as 
well  as  a  marginal  improvement  of  the  5-day  forecast. 
3)  Moreover,  the  improvement  of  the  assimilation  is 
quite  significant  for  the  experiment  with  30%  pertur¬ 
bation  of  the  forcing  and  10%  model  error  standard 
deviations,  from  the  second  cycle  onto  the  last.  The 
same  is  true  for  the  5-day  forecast  associated  with  this 
assimilation  experiment. 

c.  Forcing  alone 

It  may  be  argued  that  the  experiments  with  the  wrong 
July  forcing  are  fitting  the  observations  within  the  ob¬ 
servation  standard  deviation  because  of  the  contribution 
of  the  subsurface  where  forecast  and  analysis  errors  are 
significantly  lower  than  at  the  surface,  and  the  5 -day  cycle 
length  is  not  long  enough  for  the  wrong  surface  forcing  to 
drastically  and  adversely  impact  the  forecast. 

A  sixth  experiment  (Exp6)  is  set  up  that  seeks  to 
correct  only  the  external  surface  forcing  by  assimilating 


only  surface  observations.  This  experiment  uses  correct 
initial  conditions,  the  30%  perturbation  on  the  forcing, 
and  the  10%  model  error  standard  deviation  prescribed 
only  at  the  surface.  According  to  Fig.  11,  the  30% 
forcing  perturbation  produces  errors  that  are  most  no¬ 
ticeable  in  the  heat  flux  and,  therefore,  in  the  surface 
temperatures.  The  fit  to  the  observations  metric  for  this 
experiment  shows,  in  Fig.  15,  that  for  salinity  and  SSH, 
both  the  free-run  and  forecast  errors  are  always  within 
an  observation  standard  deviation  and,  thus,  the  analy¬ 
sis.  The  same  is  true  for  velocity  until  the  last  two  cycles, 
where  the  free-run  errors  increase  to  slightly  more  than 
a  standard  deviation,  and  the  assimilation  subsequently 
corrects  them.  The  major  impact  of  the  assimilation  is  in 
the  surface  temperature,  where  errors  as  high  as  two 
observation  standard  deviations  in  the  free  run  and  the 
forecast  are  reduced  to  less  than  a  standard  deviation  in 
the  analysis. 

Surface  temperature,  elevation,  and  velocity  fields  (at 
the  end  of  the  first  and  the  fourth  cycles)  in  Fig.  16  show 
that  the  assimilation  accurately  recovers  all  the  circu¬ 
lation  features,  albeit  for  slightly  higher  temperatures 
at  isolated  locations.  Although  these  temperature  in¬ 
accuracies  may  be  as  high  as  1 K,  they  are  isolated  in  the 
sense  that  they  do  not  dominate  the  overall  fit  to  the 
observations  metric  shown  in  Fig.  15. 

Actual  forcing  errors  and  the  retrieved  errors  from  the 
assimilation  are  compared  at  the  end  of  the  first  and 
fourth  cycles  in  Fig.  17.  It  can  be  seen  that  the  assimilation 
does  not  recover  the  wind  stress  errors:  corrections  to  the 
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Fig.  17.  (top)  Actual  and  (bottom)  retrieved  errors  in  the  surface  heat  flux  and  wind  stress  at  the  end  of  the  (a),(c)  first  and  (b),(d) 

fourth  cycles. 


wind  stress  are  negligible  compared  to  the  actual  errors. 
The  assimilation  corrects  the  heat  flux  better  than  it  does 
the  wind  stress,  even  though  the  magnitude  of  the  cor¬ 
rection  is  lower  than  the  actual  errors.  It  should  be  em¬ 
phasized  that  the  assimilation  used  model  error  standard 
deviations  at  10%  of  the  actual  forcing  fields,  while  the 
actual  errors  were  set  at  30%.  The  discrepancy  in  error 
levels  explains  in  part  the  difference  between  the  re¬ 
trieved  and  actual  errors. 

d.  Shorter  cycle  strong  constraint 

In  the  seventh  experiment  (Exp7),  we  investigate 
the  possibility  of  improving  the  performance  of  the 
strong  constraint  assimilation  by  resorting  to  a  shorter 
assimilation  window.  This  idea  seems  reasonable  from 
the  tangent  linear  model  (TLM)  point  of  view:  the 
tangent  linear  approximation  underlying  the  variational 
method  is  more  accurate  and  thus  more  stable  over 
a  shorter  cycle.  It  should  be  noted,  however,  according 
to  Fig.  1  in  section  3,  that  the  TLM  stability  exceeds  10 
days  for  initial  perturbations.  Therefore,  the  original 
strong  constraint  experiment  (Expl),  which  used  an 


assimilation  window  of  5  days,  could  not  have  been 
negatively  affected  by  linear  error  growth,  and  a  strong 
constraint  experiment  with  an  assimilation  window  shorter 
than  5  days  should  not  yield  better  results.  To  verify  this, 
Exp7  is  set  up  just  as  in  Expl,  but  with  a  3-day  assimilation 
window. 

Results  in  Fig.  18  show  that  there  is  no  substantial 
improvement  from  the  5-  to  the  3 -day  strong  constraint 
assimilation  windows,  except  for  the  velocity  in  the  first 
half  of  the  month-long  experiment.  Note  that  in  that  first 
half  neither  Expl  nor  Exp7  fits  the  observations  to 
within  a  standard  deviation.  Once  that  begins  to  happen 
in  the  second  half  of  the  month,  both  experiments  have 
the  same  accuracy. 

e.  Strong  constraint  with  forcing  errors 

The  strong  constraint  performance  is  further  in¬ 
vestigated  within  the  context  of  both  initial  conditions 
and  forcing  errors.  This  follows  the  design  of  Exp5  with 
the  only  exception  that  the  assimilation  is  carried  out 
with  strong  constraint  using  the  representer  method. 
The  results  of  Exp8  are  shown  in  Fig.  19,  where  the 
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Fig.  18.  As  in  Fig.  6,  but  for  comparison  between  the  free  run  and  analyses  from  Expl 

and  Exp7. 


strong  and  weak  constraint  analyses  are  compared.  It 
can  be  seen  that  the  weak  constraint  assimilation  fits  the 
observations  better  than  the  strong  constraint,  owing  to 
its  higher  number  of  degrees  of  freedom.  It  is  expected 
of  the  strong  constraint  to  correct  the  model  trajectory 
by  adjusting  the  erroneous  initial  conditions.  However, 
if  the  model  is  driven  by  an  erroneous  forcing,  the  latter 
will  steer  the  model  trajectory  away  from  the  observa¬ 
tions  even  if  the  initial  conditions  were  perfectly 
corrected.  This  is  seen  in  Fig.  19,  especially  for  the  tem¬ 
perature  and  velocity,  by  the  decrease  of  the  analysis 
error  at  the  beginning  of  the  each  cycle  followed  by 
a  gradual  increase.  It  should  also  be  noted  that  the  strong 
constraint  fits  the  salinity  and  SSH  observations  well,  al¬ 
though  not  as  well  as  the  weak  constraint.  This  can  be 
attributed  to  the  facts  that  salinity  forcing  is  poorly  known 
and  as  such  its  perturbations  for  forcing  errors  are  small, 
and  there  is  no  forcing  perturbation  for  the  SSH  equation 
because  it  is  treated  as  a  conservative  quantity. 

f  Sampling  at  real  observation  locations 

In  most  of  the  experiments  above,  the  assimilation  ex¬ 
periments  are  assessed  at  the  observation  locations. 
However,  the  availability  of  the  true  solution  makes  it 
possible  to  evaluate  the  ability  of  the  assimilation  to 


correct  the  model  everywhere  else  in  the  domain.  A  weak 
constraint  assimilation  experiment  (Exp9)  is  carried  with 
data  sampling  and  error  settings  as  in  the  real-data  ex¬ 
periments  described  in  the  second  part  of  this  paper 
(Ngodock  and  Carrier  2014,  hereafter  Part  II),  and 
a  comparison  to  the  true  solution  over  the  entire  domain  is 
made  to  assess  the  assimilation’s  ability  to  reconstruct  the 
true  solution  away  from  the  assimilated  observations. 
Exp9  is  run  for  a  60-day  window  (in  5-day  cycles)  to  allow 
the  dynamics  to  propagate  the  assimilation  corrections 
through  the  model  domain.  In  Fig.  20  the  normalized 
global  differences  (in  the  three  space  dimensions)  between 
the  weak  constraint  and  true  solutions  are  compared. 

It  can  be  seen  that  initially  the  assimilation  does  not 
compare  well  to  the  true  solution  for  all  of  the  variables. 
The  errors  in  the  salinity  and  SSH  fields  are  lower  than 
those  in  the  temperature  and  velocity  fields,  probably  due 
to  the  same  reasons  mentioned  above  for  Fig.  19.  But  as 
time  progresses,  the  model  dynamics  are  able  to  propa¬ 
gate  the  assimilation  corrections  through  the  model  do¬ 
main  and  the  discrepancy  between  the  assimilation  and 
true  solution  decreases:  salinity  and  SSH  errors  become 
lower  than  1  standard  deviation;  temperature  errors  are 
close  to  1  standard  deviation,  having  started  around  2.0 
standard  deviations;  and  finally  velocity  errors  have 
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Fig.  19.  As  in  Fig.  6,  but  for  comparison  between  the  free-run  and  analyses  from  Exp5 

and  Exp8. 


decreased  to  1.5  standard  deviations  having  started  at  2.5. 
It  is  arguable  that  these  errors  would  have  decreased 
further  to  below  a  standard  deviation;  that  is,  the  assim¬ 
ilation  would  eventually  reproduce  the  true  solution  in 
the  entire  region  had  the  system  run  long  enough. 

5.  Conclusions 

A  4D  variational  system  was  developed  for  assimi¬ 
lating  ocean  observations  with  the  Navy  Coastal  Ocean 
Model.  The  system  is  described  in  this  paper,  along 
with  initial  assimilation  experiments  in  Monterey  Bay 
using  synthetic  observations.  The  assimilation  system  is 
tested  in  a  series  of  twin  data  experiments  to  assess  its 
ability  to  fit  assimilated  and  independent  observations 
by  controlling  the  initial  conditions  and/or  the  external 
forcing  while  assimilating  surface  and/or  subsurface 
observations.  The  minimization  of  the  cost  function 
was  done  with  the  gradient  descent  method  in  all  of 
the  strong  constraint  experiments,  and  with  the  repre¬ 
senter  method  (observation  space)  in  the  weak  con¬ 
straint  experiments. 

It  was  shown  that  in  the  strong  constraint  approach 
the  system  is  able  to  correct  wrong  initial  conditions  to 


fit  the  observations  within  a  standard  deviation  after  two 
5-day  cycles.  After  the  first  cycle,  forecast  (and  thus 
analysis)  errors  remain  low  due  to  the  right  forcing, 
except  in  the  velocity.  The  latter  displays  some  inac¬ 
curacies  at  the  end  of  the  first  cycle  that  negatively  affect 
both  the  forecast  and  analysis  in  the  second  cycle. 
However,  the  assimilation  corrects  those  inaccuracies  in 
the  third  and  subsequent  cycles.  The  length  of  the  strong 
constraint  assimilation  window  was  reduced  to  explore 
the  possibility  of  improving  its  performance.  It  was 
noted  that  since  the  TLM  was  shown  to  be  stable  beyond 
10  days,  a  shorter  than  5-day  assimilation  window  would 
not  improve  the  accuracy  of  the  strong  constraint,  and 
that  was  confirmed  by  an  actual  assimilation  experi¬ 
ment.  Finally,  the  strong  constraint  was  compared  to  the 
weak  constraint  in  the  case  of  wrong  initial  conditions 
and  forcing.  It  was  shown  that  although  the  strong 
constraint  could  accurately  correct  the  initial  conditions, 
the  wrong  forcing  would  keep  steering  the  solution  away 
from  the  observations;  only  the  fields  that  had  low 
forcing  perturbations  (salinity)  or  none  (SSH)  could  be 
accurately  fit  by  the  assimilation. 

Several  weak  constraint  experiments  were  carried 
out.  First,  a  completely  wrong  forcing  was  used  in  the 
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Fig.  20.  As  in  Fig.  6,  but  for  normalized  global  differences  (in  the  three  space  dimensions) 
between  the  true  solution  and  Exp9  (red  lines),  the  first  guess  (blue  lines),  and  true  free-run 
(green  lines)  solutions. 


free  run  and  forecast,  while  model  error  standard  de¬ 
viations  were  set  to  10%  of  the  magnitude  of  the  true 
forcing.  In  general,  the  assimilation  was  able  to  fit  the 
observations,  although  some  isolated  discrepancies 
greater  than  twice  the  observations’  standard  deviation 
remained  in  the  analysis.  This  was  mostly  due  to  high 
actual  forcing  errors  and  very  low  model  errors  for  the 
assimilation.  The  second  weak  constraint  experiment 
followed  the  first,  except  with  doubled  model  error 
standard  deviations  in  the  assimilation.  There  was 
a  marginal  improvement  in  the  analysis,  as  the  differ¬ 
ence  between  the  actual  forcing  errors  and  the  pre¬ 
scribed  model  errors  in  the  assimilation  remained  high. 
In  the  third  weak  constraint  experiment,  forcing  errors 
were  computed  as  a  30%  perturbation  of  the  true 
forcing,  and  the  model  error  standard  deviations  were 
kept  at  10%.  There  was  a  significant  improvement  in 
the  accuracy  of  the  analysis  compared  to  the  first  two 
weak  constraint  experiments.  The  fourth  weak  con¬ 
straint  experiment  follows  the  third,  except  only  sur¬ 
face  observations  were  assimilated  and  the  model 
errors  were  also  restricted  to  the  surface.  This  experi¬ 
ment  yielded  accurate  surface  fields,  and  the  forcing 
corrections  showed  that  the  assimilation  corrected  the 


heat  flux  better  than  it  did  the  wind  stress,  even  though 
the  magnitude  of  the  corrected  fluxes  was  lower  than 
the  actual  errors.  This  resulted  from  using  smaller 
model  error  standard  deviations  (a  third  of  the  actual 
errors  magnitude)  in  the  assimilation.  The  assimilation 
system  is  able  to  overcome  significantly  high  errors  in  the 
forcing  to  correct  the  model  trajectory  in  each  cycle  with 
significantly  low  model  error  specifications.  The  results  of 
the  assimilation  get  even  better  when  the  forcing  error  is 
dropped  to  30%,  while  keeping  the  model  error  to  10%. 
That  is,  when  the  model  error  is  set  to  relatively  similar 
levels  as  the  actual  errors,  the  assimilation  fits  the  obser¬ 
vations  with  greater  accuracy.  There  was  a  discrepancy 
between  the  actual  forcing  errors  and  the  retrieved  forc¬ 
ing  errors  from  the  assimilation  that  is  explained  in  part 
by  the  difference  between  the  actual  and  prescribed  error 
magnitudes  in  the  assimilation. 

Finally,  a  weak  constraint  assimilation  experiment 
was  carried  with  data  sampling  and  error  settings  as 
in  the  real-data  experiments  described  in  Part  II,  and 
a  comparison  to  the  true  solution  over  the  entire 
domain  was  made  to  assess  the  assimilation’s  ability 
to  reconstruct  the  true  solution  away  from  the  assim¬ 
ilated  observations.  It  was  shown  that  this  global 
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analysis  error  was  high  initially,  especially  in  tempera¬ 
ture  and  velocity.  But  as  time  progressed,  the  discrep¬ 
ancy  between  the  assimilation  and  the  true  solution 
decreased  because  the  model  dynamics  were  able  to 
propagate  the  assimilation  corrections  through  the 
model  domain.  It  is  arguable  that  these  errors  would 
have  decreased  further  to  below  a  standard  deviation, 
that  is,  the  assimilation  would  eventually  reproduce  the 
true  solution  in  the  entire  region  had  the  system  run 
long  enough. 
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APPENDIX 


Discretization  of  NCOM  Equations 

The  discretization  of  NCOM  uses  second-order  in¬ 
terpolation  and  differentiation  as  defined  with  the  fol¬ 
lowing  notation: 
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The  NCOM  equations  are  then  discretized  in  flux  con¬ 
servative  form  as  follows 
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In  (A1)-(A5),  Fu  and  Fv  are  the  horizontal  mixing 
terms;  £atm  and  ftp  are  the  atmospheric  surface  pressure 
and  tidal  potential,  respectively;  and  is  the  surface 
elevation  term  that  can  be  distributed  among  any  of 
the  three  time  levels,  f*  =  (*i£n+1  +  a2£n  +  u3fn_1,  ac¬ 
cording  to  the  temporal  weighting  terms  a3,  a2 ,  or  a3, 
which  are  specified  by  the  user.  The  horizontal  mixing 
coefficients  for  the  velocity  and  scalar  fields  (tempera¬ 
ture  and  salinity)  are  Am  and  Ah,  respectively;  likewise, 
Km  and  KM  are  used  for  the  vertical  mixing;  Q  is  a  vol¬ 
ume  flux  source  term  (with  Tsor ,  £sor,  wsor,  and  vsor  as  the 
term  source  values);  Qr  is  the  solar  radiation;  y  is 
a  function  describing  the  solar  extinction;  Ax,  Ay,  and  A z 
denote  the  grid-cell  dimensions  defined  at  the  center  of 
the  grid  cells;  and  the  superscripts  u ,  u,  and  w  indicate 


the  grid-cell  dimensions  computed  at  those  velocity  lo¬ 
cations  on  the  staggered  Arakawa  C  grid.  The  Coriolis 
term  is  /;  p0  and  pt  are  the  reference  density  of  seawater 
and  the  internal  pressure,  respectively;  and  the  hori¬ 
zontal  advection  velocity  terms  are  given  by  ua  and  va. 
The  term  CCUrv  is  used  to  correct  the  horizontal  advec¬ 
tion  of  momentum  for  the  horizontal  curvature  of  the 
grid.  It  is  calculated  as 
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The  horizontal  mixing  terms  for  the  momentum  equa¬ 
tions  are  given  by 
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where  the  mixing  coefficient  is  modeled  according  to  Smagorinsky’s  formula: 
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with  the  magnitude  of  the  eddy  coefficient  being  scaled  by 
the  constant  CSmag*  The  vertical  mixing  coefficients  are 
computed  using  the  turbulence  closure  method  of  Mellor 
and  Yamada  in  either  its  2  or  2.5  version. 

The  computation  for  the  free-surface  mode  is  governed 
by  the  following  equations: 

-^82t{Duu)  =  -AyuDugSx(air+1  +  a2C” 

+  a3r_1)  +  DuG~u,  (A10) 
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where  j31?  fi2,  and  fi3  are  positive  constants  define  by 
the  user  with  +  fi2  +  fi3  =  1,  and  DUGU  and  DVGV  are 
the  vertical  integrals  of  all  the  terms  on  the  right-hand 
sides  of  (Al)  and  (A2),  respectively,  with  the  excep¬ 
tion  of  the  surface  elevation  gradient  terms  and  the 
vertical  mixing,  and  Du  =  Dx  and  Dv  =  Dy .  The  free- 
surface  mode  (A12)  is  solved  by  first  substituting 
( Duu)n+1  and  ( Dvv)n+1  into  the  time-discretized  (A10) 
and  (All)  into  (A12),  resulting  in  an  elliptic  equation 
that  is  solved  for  the  surface  elevation  at  time  level 
n  +  1,  which  is  then  substituted  back  into  (A10)  and 
(All)  to  compute  the  barotropic  transports  Duu 
and  Dvv ,  from  which  the  barotropic  velocities  are 
obtained. 

The  vertical  discretization  uses  a  combination  of 
sigma  layers  and  z  levels  in  a  three-tiered  distribu¬ 
tion  with  (i)  free  sigma  layers  near  the  surface  that 
expand  and  contract  with  the  free  surface  elevation, 
(ii)  fixed  sigma  layers  that  do  not  vary  with  the  free 
surface,  and  (iii)  fixed  z  levels  that  allow  for  partial 
bottom  cells  for  a  better  match  with  the  bottom 
topography. 
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